10th World Congress in Probability and Statistics
Invited Session (live Q&A at Track 1, 11:30AM KST)
Invited 38
IMS Lawrence D. Brown Ph.D. Student Award Session (Organizer: Institute of Mathematical Statistics)
Conference time: 11:30 AM – 12:00 PM KST
Local time: Jul 21 (Wed), 7:30 PM – 8:00 PM PDT
Efficient manifold approximation with spherelets
Didong Li (Princeton University / University of California)
Data lying in a high-dimensional ambient space are commonly thought to have a much lower intrinsic dimension. In particular, the data may be concentrated near a lower-dimensional subspace or manifold. There is an immense literature focused on approximating the unknown subspace and on exploiting such approximations in clustering, data compression, and the building of predictive models. Most of this literature relies on approximating subspaces using a locally linear, and potentially multiscale, dictionary. In this talk, a simple and general alternative is introduced, which instead uses pieces of spheres, or spherelets, to locally approximate the unknown subspace. Theory is developed showing that spherelets can produce lower covering numbers and MSEs for many manifolds. We develop spherical principal components analysis (SPCA). Results relative to state-of-the-art competitors show gains in the ability to approximate the subspace accurately with fewer components. In addition, unlike most competitors, our approach can be used for data denoising and can efficiently embed new data without retraining. The methods are illustrated with standard toy manifold learning examples and with applications to multiple real data sets.
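The core idea of the abstract, approximating a manifold locally by pieces of spheres rather than by planes, can be illustrated with a minimal sphere-fitting sketch. This is not the authors' SPCA algorithm; it is a standard algebraic least-squares sphere fit under hypothetical names (`fit_sphere`, `project_to_sphere`), shown only to convey what "fitting a spherelet to local data" might look like.

```python
import numpy as np

def fit_sphere(X):
    """Algebraic least-squares sphere fit (a standard construction).

    Each row x of X should satisfy ||x - c||^2 = r^2, which linearizes to
    2 x.c + (r^2 - ||c||^2) = ||x||^2, a linear system in the unknowns
    (c, r^2 - ||c||^2) solvable by ordinary least squares.
    """
    n, d = X.shape
    A = np.hstack([2 * X, np.ones((n, 1))])
    b = (X ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    c = sol[:d]
    r = np.sqrt(sol[d] + c @ c)
    return c, r

def project_to_sphere(X, c, r):
    """Map each point to the nearest point on the fitted sphere
    (a simple way to denoise data assumed to lie near the sphere)."""
    diff = X - c
    return c + r * diff / np.linalg.norm(diff, axis=1, keepdims=True)
```

In a spherelet-style pipeline one would apply such a fit within local neighborhoods of the data, replacing the local linear (tangent-plane) fit of classical local PCA.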
Toward instance-optimal reinforcement learning
Ashwin Pananjady (Georgia Institute of Technology)
The paradigm of reinforcement learning has now made inroads in a wide range of applied problem domains. This empirical research has revealed the limitations of our theoretical understanding: popular RL algorithms exhibit a variety of behavior across domains and problem instances, and existing theoretical bounds, which are generally based on worst-case assumptions, can often produce pessimistic predictions. An important goal is thus to develop instance-specific analyses that help to reveal what aspects of a given problem make it "easy" or "hard", and allow distinctions to be drawn between ostensibly similar algorithms. Taking an approach grounded in nonparametric statistics, we initiate a study of this question for the policy evaluation problem. We show via information-theoretic lower bounds that many popular variants of stochastic approximation or "temporal difference learning" algorithms *do not* exhibit the optimal instance-specific performance in the finite-sample regime. On the other hand, making careful modifications to these algorithms does result in automatic adaptation to the intrinsic difficulty of the problem. When there is function approximation involved, our bounds also characterize the instance-optimal tradeoff between approximation and estimation errors in solving projected fixed-point equations, a general class of problems that includes policy evaluation as a special case. These oracle inequalities, which are non-standard and involve a non-unit pre-factor multiplying the approximation error, may be of independent statistical interest.
Bayesian pyramids: identifying interpretable discrete latent structures from discrete data
Yuqi Gu (Columbia University)
High-dimensional categorical data are routinely collected in the biomedical and social sciences. It is of great importance to build interpretable models that perform dimension reduction and uncover meaningful latent structures from such discrete data. Identifiability is a fundamental requirement for valid modeling and inference in such scenarios, yet is challenging to address when there are complex latent structures. In this work, we propose a class of interpretable discrete latent structure models for discrete data and develop a general identifiability theory. Our theory is applicable to various types of latent structures, ranging from a single latent variable to deep layers of latent variables organized in a sparse graph (termed a Bayesian pyramid). The proposed identifiability conditions can ensure Bayesian posterior consistency under suitable priors. As an illustration, we consider the two-latent-layer model and propose a Bayesian shrinkage estimation approach. Simulation results for this model corroborate identifiability and estimability of the model parameters. Applications of the methodology to DNA nucleotide sequence data uncover discrete latent features that are both interpretable and highly predictive of sequence types. The proposed framework provides a recipe for interpretable unsupervised learning of discrete data, and can be a useful alternative to popular machine learning methods.
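To make the layered structure concrete, here is a hypothetical generative sketch of a small two-latent-layer "pyramid": one binary top-layer latent drives two binary middle-layer latents through a sparse graph, and each categorical observable loads on a single middle latent. The probabilities, sparsity pattern, and function names are illustrative assumptions, not the model or priors proposed in the talk.

```python
import numpy as np

def simulate_pyramid(n, rng, p_top=0.5):
    """Toy two-latent-layer generative sketch (illustrative only).

    Top layer   : one binary latent z_top
    Middle layer: two binary latents, each depending on z_top
    Observed    : four categorical variables (3 levels), each loading
                  on exactly one middle latent (a sparse graph)."""
    z_top = rng.random(n) < p_top
    # middle latents are more likely "on" when the top latent is on
    p_mid = np.where(z_top[:, None], 0.8, 0.2)          # broadcast to (n, 2)
    z_mid = rng.random((n, 2)) < p_mid
    # conditional distributions of each observable given its parent
    probs_on = np.array([0.7, 0.2, 0.1])                # P(X = k | parent = 1)
    probs_off = np.array([0.1, 0.2, 0.7])               # P(X = k | parent = 0)
    X = np.empty((n, 4), dtype=int)
    for j in range(4):
        parent = z_mid[:, j % 2]                        # sparse loading
        for i in range(n):
            p = probs_on if parent[i] else probs_off
            X[i, j] = rng.choice(3, p=p)
    return X, z_mid, z_top
```

Identifiability questions of the kind the talk addresses ask when the conditional probability tables and the sparse graph above can be recovered from the distribution of X alone.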
Q&A for Invited Session 38
This talk does not have an abstract.
Session Chair
Tracy Ke (Harvard University)